A Cache Design of SSD-based Search Engine Architectures: An Experimental Study

نویسندگان

  • JIANGUO WANG
  • GANG WANG
  • XIAOGUANG LIU
چکیده

Caching is an important optimization in search engine architectures. Existing caching techniques for search engine optimization are mostly biased towards the reduction of random accesses to disks, because random accesses are known to be much more expensive than sequential accesses in traditional magnetic hard disk drive (HDD). Recently, solid state drive (SSD) has emerged as a new kind of secondary storage medium, and some search engines like Baidu have already used SSD to completely replace HDD in their infrastructure. One notable property of SSD is that its random access latency is comparable to its sequential access latency. Therefore, the use of SSDs to replace HDDs in a search engine infrastructure may void the cache management of existing search engines. In this paper, we carry out a series of empirical experiments to study the impact of SSD on search engine cache management. Based on the results, we give insights to practitioners and researchers on how to adapt the infrastructure and caching policies for SSD-based search engines.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

EDS: An Efficient Data Selection policy for search engine storage architectures

Caching is an effective optimization in search engine storage architectures. Many caching algorithms have been proposed to improve retrieval performance. The data selection policy of search engine cache management plays an important role, which carefully places the data in memory or other storage, such as solid state disks (SSDs). Considering that the historical query log has a guiding role for...

متن کامل

A Comparing between the impacts of text based indexing and folksonomy on ranking of images search via Google search engine

Background and Aim: The purpose of this study was to compare the impact of text based indexing and folksonomy in image retrieval via Google search engine. Methods: This study used experimental method. The sample is 30 images extracted from the book “Gray anatomy”. The research was carried out in 4 stages; in the first stage, images were uploaded to an “Instagram” account so the images are tagge...

متن کامل

An Efficient Design and Implementation of Multi-level Cache for Database Systems

Flash-based solid state device(SSD) is making deep inroads into enterprise database applications due to its faster data access. The capacity and performance characteristics of SSD make it well-suited for use as a second-level buffer cache. In this paper, we propose a SSD-based multilevel buffer scheme, called flash-aware second-level cache(FASC), where SSD serves as an extension of the DRAM buf...

متن کامل

Query-Driven Indexing in Large-Scale Distributed Systems

Efficient and effective search in large-scale data repositories requires complex indexing solutions deployed on a large number of servers. Web search engines such as Google and Yahoo! already rely upon complex systems to be able to return relevant query results and keep processing times within the comfortable sub-second limit. Nevertheless, the exponential growth of the amount of content on the...

متن کامل

GREM: Dynamic SSD Resource Allocation In Virtualized Storage Systems With Heterogeneous VMs

In a shared virtualized storage system that runs heterogeneous VMs with diverse IO demands, it becomes a problem for the hypervisor to cost-effectively partition and allocate SSD resources among multiple VMs. There are two straightforward approaches to solve this problem: either equally assigning SSDs to each VM or managing SSD resources in a fair competition mode. Unfortunately, they cannot fu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014